In this note we present studies of coverage and power for confidence intervals for a Poisson process with known
background, calculated using the likelihood ratio (Feldman & Cousins) ordering with Bayesian treatment of
uncertainties in nuisance parameters. We consider both the variant where the Bayesian integration is done in both
the numerator and the denominator and the modification where the integration is done only in the numerator, whereas
in the denominator the likelihood is taken at the maximum likelihood estimate of the parameters. Furthermore, we
discuss how measurements can be combined in this framework and give an illustration with limits on the branching
ratio of a rare B-meson decay recently presented by CDF/D0. A set of C++ classes has been developed which can
be used to calculate confidence intervals for a single experiment or for a combination of multiple experiments,
using the above algorithms and considering a variety of parameterizations to describe the uncertainties.



1 Introduction 

A popular technique for calculating confidence intervals in recent years is the one suggested by
Feldman & Cousins [1]. The method consists of constructing an acceptance region for each possible
hypothesis (in the way proposed by Neyman [3]) and fixing the limits of the region by including
experimental outcomes in order of rank, where the rank is given by the likelihood ratio:

$$R = \frac{\mathcal{L}(n \mid s + b)}{\mathcal{L}(n \mid s_{\mathrm{best}} + b)} \qquad (1)$$



where s is the hypothesis, n the experimental outcome, b the expected background,
$s_{\mathrm{best}}$ the hypothesis most compatible with n, and $\mathcal{L}$ the likelihood
function. The expected background b is an example of a so-called nuisance parameter, i.e. a
parameter which is not of primary interest but which still affects the calculated confidence
interval. Another example of such a nuisance parameter is the signal efficiency. In the method
originally proposed by Feldman & Cousins, only the presence of background was considered, and it
was assumed to be exactly known. The question of how to treat uncertainties in nuisance parameters
in confidence interval calculations, in particular in the context of the frequentist construction,
has drawn considerable attention in recent years. In 1992 Cousins & Highland [2] proposed a method
based on a Bayesian treatment of the nuisance parameters. The main idea is to use a probability
density function (pdf) in which the average is taken over the nuisance parameter:

$$P(n \mid s, \epsilon) \longrightarrow \int P(n \mid s, \epsilon')\, P(\epsilon' \mid \epsilon)\, d\epsilon' := q(n \mid s, \epsilon) \qquad (2)$$

where $\epsilon'$ is the true value of the nuisance parameter, $\epsilon$ denotes its estimate,
and s and n symbolize the signal hypothesis and the experimental outcome, respectively.
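
The averaging in equation (2) can be sketched numerically as follows. This is a minimal illustration and not the pole++ implementation: it assumes that the nuisance parameter is a signal efficiency entering the Poisson mean as eff * s + b, that its pdf is a Gaussian truncated at zero, and that a simple midpoint rule is accurate enough. The function names `poisson_pmf` and `smeared_pmf` are hypothetical.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of observing n events for mean mu."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def smeared_pmf(n, s, b, eff, sigma_eff, steps=400):
    """q(n | s, eff) of equation (2) for a Gaussian efficiency uncertainty.

    Assumptions (not taken from the text): the Poisson mean is eff' * s + b,
    the Gaussian is truncated to eff' >= 0 and renormalized, and the
    integral is approximated by a midpoint rule over +-5 sigma.
    """
    if sigma_eff == 0.0:
        return poisson_pmf(n, eff * s + b)
    lo = max(0.0, eff - 5.0 * sigma_eff)
    hi = eff + 5.0 * sigma_eff
    h = (hi - lo) / steps
    num = 0.0
    norm = 0.0
    for k in range(steps):
        e = lo + (k + 0.5) * h                       # true efficiency eff'
        w = math.exp(-0.5 * ((e - eff) / sigma_eff) ** 2)
        num += w * poisson_pmf(n, e * s + b)
        norm += w
    return num / norm
```

Because q is a weighted average of Poisson distributions, it is still normalized over n, but it is broader than the Poisson distribution evaluated at the estimated efficiency.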

Cousins & Highland only treated the case of Gaussian uncertainties in the signal efficiency. The
method has since been generalized by Conrad et al. [4] to operate with the Feldman & Cousins
ordering scheme and to take into account both efficiency and background uncertainties as well as
correlations. This generalized method has already been used in a number of particle and
astroparticle physics experiments (see references in Tegenfeldt & Conrad [5]). FHC² denotes this
generalized method in the remainder of this note.

Footnote a: Throughout this note we consider Poisson distributions with experimental outcome n,
hypothesis parameter s and (possibly not exactly) known background b.

If significantly fewer events are observed than the expected background, FHC² tends to yield
confidence intervals that become smaller with increasing uncertainties. Hill [6] therefore
proposed a modification in which the likelihood ratio used in the ordering is defined as:

$$R = \frac{q(n \mid s, b)}{\mathcal{L}(n \mid \max(0, n_{\mathrm{obs}} - \hat{b}) + \hat{b})} \qquad (3)$$

where the numerator is the pdf averaged over the nuisance parameter as in equation (2), and
$\hat{b}$ is the maximum likelihood estimate of b given the subsidiary observation of b. MBT
("Modified Bayesian Treatment") denotes this modification in the remainder of this note.
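
The ordering principle underlying both methods can be illustrated with the simplest case, the likelihood ratio ordering of equation (1) with exactly known background. The sketch below is not the pole++ code; the function names, the grid scan over s and the cut-off `n_max` are assumptions made for illustration. In FHC² and MBT the Poisson probabilities would be replaced by the averaged pdf q.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of observing n events for mean mu."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def acceptance_region(s, b, cl=0.90, n_max=100):
    """Outcomes n accepted for hypothesis s, ranked by R of equation (1)."""
    ranked = []
    for n in range(n_max + 1):
        s_best = max(0.0, n - b)          # best-fit signal, restricted to s >= 0
        r = poisson_pmf(n, s + b) / poisson_pmf(n, s_best + b)
        ranked.append((r, n))
    ranked.sort(reverse=True)             # highest rank first
    accepted = set()
    total = 0.0
    for r, n in ranked:
        accepted.add(n)
        total += poisson_pmf(n, s + b)
        if total >= cl:                   # stop once the region holds >= cl
            break
    return accepted

def confidence_interval(n_obs, b, cl=0.90, s_max=15.0, step=0.01):
    """Scan s on a grid and keep every s whose acceptance region holds n_obs."""
    n_steps = int(round(s_max / step))
    inside = [k * step for k in range(n_steps + 1)
              if n_obs in acceptance_region(k * step, b, cl)]
    return min(inside), max(inside)
```

For example, `confidence_interval(0, 0.0)` returns an interval starting at zero with an upper limit of roughly 2.4, close to the value tabulated by Feldman & Cousins for zero observed events and no background.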

In this contribution, we discuss the coverage and power of these two methods as well as the
combination of different experiments with and without correlations. We start by introducing the
C++ library which has been developed to perform the necessary calculations.

2 POLE++ 

For the coverage studies presented in this paper a reasonably fast and efficient code is required.
Hence, a user-friendly and flexible C++ library of classes was developed, based on the FORTRAN
routine presented by Conrad [8]. The library is independent of external libraries and consists of
two main classes, Pole and Coverage. The first class takes as input the number of observed events
and the efficiency and background with their uncertainties, and calculates the limits using the
methods described in this paper. The integrals are solved analytically. Coverage generates
user-defined pseudo-experiments and calculates the coverage using Pole. Presently the library
supports Gaussian, log-normal and flat pdfs for the description of the nuisance parameters.
Several experiments with correlated or uncorrelated uncertainties in the nuisance parameters can
be combined. The pole++ library can be obtained from http://cern.ch/tegen/statistics.html

3 Coverage and Power 

The most crucial property of methods for confidence interval construction is the coverage: a
fraction (1 − α) of infinitely many repeated experiments should yield confidence intervals that
include the true hypothesis, irrespective of what the true hypothesis is.

For a confidence interval construction (according to Neyman) without uncertainties in nuisance
parameters this property is fulfilled by construction. In the present case, however, we have to
test the coverage using Monte Carlo experiments.
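
A Monte Carlo coverage test of this kind can be sketched as follows. The sketch does not use the Pole or Coverage classes; as a stand-in for the interval method it assumes a classical Poisson upper limit with exactly known background, and all function names are hypothetical. Pseudo-experiments are drawn from the true hypothesis, and the coverage is the fraction of them whose interval contains the true signal.

```python
import math
import random

def poisson_pmf(n, mu):
    """Poisson probability of observing n events for mean mu."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def poisson_cdf(n, mu):
    """Probability of observing at most n events for mean mu."""
    return sum(poisson_pmf(k, mu) for k in range(n + 1))

def upper_limit(n_obs, b, cl=0.90):
    """Stand-in interval method: classical upper limit on s, found by bisection."""
    alpha = 1.0 - cl
    if poisson_cdf(n_obs, b) <= alpha:
        return 0.0
    lo, hi = 0.0, 10.0
    while poisson_cdf(n_obs, hi + b) > alpha:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid + b) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

def sample_poisson(mu, rng):
    """Draw a Poisson variate with Knuth's method (adequate for small mu)."""
    limit = math.exp(-mu)
    k = 0
    p = rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def coverage(s_true, b, cl=0.90, n_trials=20000, seed=1):
    """Fraction of pseudo-experiments whose interval [0, limit] contains s_true."""
    rng = random.Random(seed)
    limits = {}                       # cache: the limit depends only on n
    covered = 0
    for _ in range(n_trials):
        n = sample_poisson(s_true + b, rng)
        if n not in limits:
            limits[n] = upper_limit(n, b, cl)
        if s_true <= limits[n]:
            covered += 1
    return covered / n_trials
```

With uncertain nuisance parameters, each pseudo-experiment would in addition draw the measured efficiency and background estimates before the interval is calculated; that step is omitted here to keep the sketch short.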

Power, on the other hand, is a concept defined in the context of hypothesis testing: the power of
a hypothesis testing method is the probability that it will reject the null hypothesis, $s_0$,
given that the alternative hypothesis, $s_{\mathrm{true}}$, is true. This concept is rather
difficult to generalize to confidence intervals since the alternative hypothesis is not uniquely
defined. We use the following definition of power:

$$\mathrm{power}(s_{\mathrm{true}}) = \sum_{n \notin \mathrm{Acc}(s_0)} q(n \mid s_{\mathrm{true}}, \epsilon) \qquad (4)$$

and view power as a function of $s_{\mathrm{true}}$. Here $\mathrm{Acc}(s_0)$ denotes the
acceptance region of $s_0$. This seems an intuitively appealing measure: given the choice between
different methods, the one with minimally overlapping acceptance regions should be chosen.
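
The definition of power as the probability of falling outside the acceptance region of the null hypothesis can be sketched for the simplest case. The code below assumes exactly known background, so that the plain Poisson probability takes the place of the averaged pdf q, and it uses the likelihood ratio ordering of equation (1) to build the acceptance region. The function names are hypothetical.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of observing n events for mean mu."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def acceptance_region(s, b, cl=0.90, n_max=100):
    """Outcomes n accepted for hypothesis s under likelihood ratio ordering."""
    ranked = []
    for n in range(n_max + 1):
        s_best = max(0.0, n - b)
        ranked.append((poisson_pmf(n, s + b) / poisson_pmf(n, s_best + b), n))
    ranked.sort(reverse=True)
    accepted = set()
    total = 0.0
    for r, n in ranked:
        accepted.add(n)
        total += poisson_pmf(n, s + b)
        if total >= cl:
            break
    return accepted

def power(s_true, s0, b, cl=0.90):
    """Probability, under s_true, that n falls outside Acc(s0)."""
    acc = acceptance_region(s0, b, cl)
    return 1.0 - sum(poisson_pmf(n, s_true + b) for n in acc)
```

By construction the power at s_true = s0 is at most 1 minus the confidence level, and it grows as s_true moves away from s0 and the probability mass leaves the acceptance region.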

Typical examples of the coverage as a function of the signal hypothesis are shown in figure 1. It
can be seen that the introduction of a continuous variable leads to a considerable smoothing of
the coverage plot. A modest amount of over-coverage is introduced, similarly for the MBT method
and the FHC² method. For large Gaussian uncertainties in the efficiency (~ 40 %) the over-coverage
of MBT is less pronounced than for FHC². More detailed coverage studies of the FHC² method have
been presented by Tegenfeldt & Conrad [5]. The power of the FHC² and MBT methods is compared in
figure 1 for 40 % uncertainties in the efficiency. FHC² has higher power for hypotheses rather far
away from the null hypothesis. This is true only for large signals and comparably large
uncertainties (and for not too large differences between $s_0$ and $s_{\mathrm{true}}$); otherwise
the differences are negligible.

Figure 1. Examples of the coverage and power of the discussed methods. Uppermost figure: coverage
of the FHC² method assuming 5 % and 40 % Gaussian uncertainties in the efficiency. Middle figure:
the coverage of the FHC² method compared to the MBT method for 40 % Gaussian efficiency
uncertainties. Lowest figure: the power of the two methods compared for 40 % Gaussian
uncertainties in the efficiency.

4 Combining different experiments

The combination of experiments can be divided into two cases. The simpler case is that of
completely uncorrelated experiments: in this case the pdf used in the construction is given by the
product of the pdfs of the single experiments:

$$q(n \mid s) = \prod_{i=1}^{n_{\mathrm{exp}}} q(n_i \mid s, \epsilon_i) \qquad (5)$$
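
For uncorrelated experiments the combined pdf is simply the product of the single-experiment pdfs, which can be sketched as below. As in any such illustration, several choices are assumptions rather than the pole++ implementation: each experiment is described by a hypothetical tuple (background, efficiency estimate, Gaussian efficiency uncertainty), the Poisson mean is eff * s + b, and the efficiency average is done with a midpoint rule.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of observing n events for mean mu."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def smeared_pmf(n, s, b, eff, sigma_eff, steps=200):
    """Single-experiment pdf averaged over a truncated Gaussian efficiency."""
    if sigma_eff == 0.0:
        return poisson_pmf(n, eff * s + b)
    lo = max(0.0, eff - 5.0 * sigma_eff)
    h = (eff + 5.0 * sigma_eff - lo) / steps
    num = 0.0
    norm = 0.0
    for k in range(steps):
        e = lo + (k + 0.5) * h
        w = math.exp(-0.5 * ((e - eff) / sigma_eff) ** 2)
        num += w * poisson_pmf(n, e * s + b)
        norm += w
    return num / norm

def combined_pmf(ns, s, experiments):
    """Product over experiments of the single-experiment pdfs.

    ns          : observed counts, one per experiment
    experiments : list of (b, eff, sigma_eff) tuples (hypothetical layout)
    """
    prob = 1.0
    for n_i, (b, eff, sigma_eff) in zip(ns, experiments):
        prob *= smeared_pmf(n_i, s, b, eff, sigma_eff)
    return prob
```

The combined pdf is a function of the vector of outcomes, so acceptance regions for a combination are built by ranking points in the space of all per-experiment counts rather than single values of n.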

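If correlations between uncertainties in nuisance parameters have to be considered, multivariate
pdfs have to be employed:

$$q(n \mid s, \epsilon) = \int_0^\infty \cdots \int_0^\infty \prod_{j=1}^{n_{\mathrm{exp}}} P(n_j \mid s, \epsilon'_j)\, P(\epsilon' \mid \epsilon) \prod_{i=1}^{n_{\mathrm{exp}}} d\epsilon'_i \qquad (6)$$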

We illustrate the effect of combining different experiments with the example of the CDF limit on
the branching ratio for $B^0_s \to \mu^+ \mu^-$, see table 1. In this case, two CDF data sets are
combined with an uncorrelated uncertainty in the background expectation and an uncertainty in the
efficiency which can be factorized into a correlated and an uncorrelated part [7]. Bernhard et
al. [7] presented a fully Bayesian combination, which is included in the table for comparison.
The limit obtained using the FHC² method is slightly smaller than the fully Bayesian limit.

Table 1. The CDF single and combined limits on $B^0_s \to \mu^+ \mu^-$ calculated by FHC². CDF 1
and CDF 2 denote the two different data sets used for the single limits. The quoted uncertainties
are for the single experiments; the efficiency uncertainties change to 13.1 % and 11.1 % for the
uncorrelated part if the experiments are combined. The number in parentheses is the result of the
purely Bayesian calculation [7].

                                CDF 1     CDF 2
eff. uncertainty [%]            18.2      16.0
eff. uncertainty [%]            20.3      19.2
corr. eff. uncertainty [%]          15.5
95 % CL [10^-7]                 2.5       4.3
95 % comb. [10^-7]                  1.7 (2.0)



5 Discussion & Conclusion

There are two main caveats when interpreting the presented results. First, the methods (more or
less implicitly) assume a flat prior probability for the true nuisance parameter. Thus,
conclusions on the coverage and power hold only for that prior. This assumption seems particularly
harmful in the case of combined experiments, a case for which we did not calculate the coverage.
Results presented at this conference by Heinrich [9] indicate that the assumption of a flat prior
for nuisance parameters in each channel leads to significant under-coverage for fully Bayesian
confidence intervals. Heinrich also shows that this behavior can be remedied with an appropriate
choice of prior (in his particular example: 1/ε). For the methods presented here this might imply
that there is under-coverage in the case of several combined experiments. A second caveat is that
we test the coverage only for the 90% confidence level. At this conference Cranmer [10] presented
results that indicate under-coverage for very high confidence levels (> 5σ) if uncertainties in
the background are treated in the Bayesian way. Tests of coverage for high confidence levels and
combined experiments are currently under way. With these caveats in mind, we conclude that the
Bayesian treatment of nuisance parameters introduces a moderate amount of over-coverage. The MBT
method has less over-coverage for the case with large Gaussian uncertainties in the signal
efficiencies. We also compared the power of the two suggested methods. For large uncertainties and
large true signals, the FHC² method has higher power for hypotheses relatively far away from the
null hypothesis.